Confidence for Speaker Diarization using PCA Spectral Ratio

Authors

  • Orith Toledo-Ronen
  • Hagai Aronowitz
Abstract

Confidence scoring is an important component in speaker diarization systems, both for offline speech analytics and for online diarization systems that are required to produce the speaker segmentation from very little audio. This paper proposes a confidence measure for speaker diarization based on the spectral ratio of the eigenvalues of the Principal Component Analysis (PCA) transformation computed on the pre-segmented audio before diarization is performed on the conversation. We tested our method on two-speaker data, and our results show the effectiveness of the PCA spectral ratio confidence measure for both offline and online diarization. We compare and contrast our proposed confidence measure with other clustering validation methods, which provide a quantitative measure of the segmentation quality but are calculated on the segmented data after diarization is performed, and with a related approach that extracts a confidence from the PCA of the pre-segmented audio.
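As a rough illustration of the idea in the abstract, the sketch below computes a ratio of PCA eigenvalues from per-segment feature vectors. The abstract does not give the exact definition of the spectral ratio, so the choice here (largest eigenvalue divided by the second largest) is an assumption, as are the function name `pca_spectral_ratio`, the input layout (one row per pre-segmented audio chunk), and the use of NumPy. It is not the authors' implementation.

```python
import numpy as np


def pca_spectral_ratio(segment_vectors):
    """Ratio of the largest to the second-largest PCA eigenvalue.

    ASSUMPTION: this particular ratio is an illustrative choice; the
    abstract does not specify how the spectral ratio is defined.
    `segment_vectors` is a hypothetical (n_segments, n_dims) array with
    one feature vector per pre-segmented audio chunk.
    """
    X = np.asarray(segment_vectors, dtype=float)
    if X.ndim != 2 or X.shape[0] < 2 or X.shape[1] < 2:
        raise ValueError("need a 2-D array with at least 2 rows and 2 columns")

    # PCA eigenvalues are the eigenvalues of the sample covariance matrix.
    # np.cov centers the data itself; rowvar=False treats columns as variables.
    cov = np.cov(X, rowvar=False)

    # eigvalsh returns eigenvalues of a symmetric matrix in ascending order.
    eigvals = np.linalg.eigvalsh(cov)[::-1]

    # Guard against a zero (or slightly negative, due to round-off) denominator.
    eps = 1e-12
    return float(eigvals[0] / max(eigvals[1], eps))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in for two well-separated speakers: two clusters
    # offset along the first axis, plus unit-variance noise.
    offset = np.array([5.0, 0.0, 0.0, 0.0])
    two_speakers = np.vstack([
        rng.normal(size=(100, 4)) + offset,
        rng.normal(size=(100, 4)) - offset,
    ])
    one_speaker = rng.normal(size=(200, 4))
    print("two clusters:", pca_spectral_ratio(two_speakers))
    print("one cluster: ", pca_spectral_ratio(one_speaker))
```

On synthetic data like this, a single dominant direction of variation yields a large ratio, while isotropic data yields a ratio near one. Whether and how such a value maps onto diarization confidence for real audio is the subject of the paper and is not reproduced here.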

Similar resources

Two-Pass IB Based Speaker Diarization System Using Meeting-Specific ANN Based Features

In this paper, we present a two-pass Information Bottleneck (IB) based system for speaker diarization which uses meeting-specific artificial neural network (ANN) based features. We first use an IB based speaker diarization system to get the labelled speaker segments. These segments are re-segmented using Kullback-Leibler Hidden Markov Model (KL-HMM) based re-segmentation. The multi-layer ANN is the...


Speaker Diarization Based on GMM Supervectors and Unsupervised Intra-speaker Variability Modeling

This paper presents a novel framework for speaker diarization. Audio is parameterized by a sequence of GMM-supervectors representing overlapping short segments of speech. Session-dependent intra-session intra-speaker variability is estimated online in an unsupervised manner, and is removed from the supervectors using Nuisance Attribute Projection (NAP). The supervectors are then projected using ...


Trainable speaker diarization

This paper presents a novel framework for speaker diarization. We explicitly model intra-speaker inter-segment variability using a speaker-labeled training corpus and use this modeling to assess the speaker similarity between speech segments. Modeling is done by embedding segments into a segment-space using kernel-PCA, followed by explicit modeling of speaker variability in the segment-space. O...


Exploiting Intra-Conversation Variability for Speaker Diarization

In this paper, we propose a new approach to speaker diarization based on the Total Variability approach to speaker verification. Drawing on previous work done in applying factor analysis priors to the diarization problem, we arrive at a simplified approach that exploits intra-conversation variability in the Total Variability space through the use of Principal Component Analysis (PCA). Using our...


Integration of TDOA features in information bottleneck framework for fast speaker diarization

In this paper we address the combination of multiple feature streams in a fast speaker diarization system for meeting recordings. Whenever Multiple Distant Microphones (MDM) are used, it is possible to estimate the Time Delay of Arrival (TDOA) for different channels. In [9], it is shown that TDOA can be used as additional features together with conventional spectral features for improving speak...


Publication date: 2012